{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# DRKG Relation Embedding Similarity Analysis\n",
    "This notebook shows how to analyze the trained relation embeddings.\n",
    "\n",
    "In this example, we first load the trained embeddings and map them back to their original relation names. We then apply three methods to analyze these embeddings:\n",
    " - Project the embeddings into a low-dimensional space and visualize their distribution.\n",
    " - Use cosine similarity to measure how similar each pair of relations is.\n",
    " - Use Frobenius distance to measure how similar each pair of relations is."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "import pandas as pd\n",
    "import numpy as np\n",
    "import os\n",
    "import csv"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Loading Relation ID Mapping"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Number of relations: 107\n"
     ]
    }
   ],
   "source": [
    "# Build lookups in both directions between relation names and integer IDs.\n",
    "rel2id = {}\n",
    "id2rel = {}\n",
    "with open(\"./train/relations.tsv\", newline='', encoding='utf-8') as csvfile:\n",
    "    reader = csv.DictReader(csvfile, delimiter='\\t', fieldnames=['rel','id'])\n",
    "    for row_val in reader:\n",
    "        rel_id = int(row_val['id'])  # avoid shadowing the built-in id()\n",
    "        relation = row_val['rel']\n",
    "\n",
    "        rel2id[relation] = rel_id\n",
    "        id2rel[rel_id] = relation\n",
    "\n",
    "print(\"Number of relations: {}\".format(len(rel2id)))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Loading Relation Embeddings"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "(107, 400)\n"
     ]
    }
   ],
   "source": [
    "rel_emb = np.load('./ckpts/TransE_l2_DRKG_0/DRKG_TransE_l2_relation.npy')\n",
    "print(rel_emb.shape)"
   ]
  },
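  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The cells below look up relation names through `id2rel` using row indices of `rel_emb`, so they assume that row `i` of the embedding matrix belongs to the relation with ID `i`. The following is a minimal sketch of a sanity check for that assumption. `check_alignment`, `toy_id2rel`, and `toy_emb` are hypothetical names introduced for illustration, and the toy inputs are made up; in this notebook one would call `check_alignment(id2rel, rel_emb)` instead."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "\n",
    "def check_alignment(id2rel, rel_emb):\n",
    "    # Hypothetical helper: True when the mapping has exactly one name\n",
    "    # for every row index 0..n-1 of the embedding matrix.\n",
    "    n = rel_emb.shape[0]\n",
    "    missing = [i for i in range(n) if i not in id2rel]\n",
    "    return len(id2rel) == n and not missing\n",
    "\n",
    "# Toy inputs for illustration only (not DRKG data).\n",
    "toy_id2rel = {0: 'A::x', 1: 'A::y', 2: 'B::z'}\n",
    "toy_emb = np.zeros((3, 4))\n",
    "check_alignment(toy_id2rel, toy_emb)"
   ]
  },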
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## General Relation Embedding Clustering\n",
    "Here we use t-SNE to project the relation embeddings into a two-dimensional space and visualize their distribution, with one color per source dataset."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "The PostScript backend does not support transparency; partially transparent artists will be rendered opaque.\n",
      "The PostScript backend does not support transparency; partially transparent artists will be rendered opaque.\n",
      "The PostScript backend does not support transparency; partially transparent artists will be rendered opaque.\n",
      "The PostScript backend does not support transparency; partially transparent artists will be rendered opaque.\n"
     ]
    },
    {
     "data": {
      "image/png": "iVBORw0KGgoAAAANSUhEUgAAAdUAAAD4CAYAAAC6/HyrAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADh0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uMy4xLjMsIGh0dHA6Ly9tYXRwbG90bGliLm9yZy+AADFEAAAgAElEQVR4nO3de0CUVf4/8PeZGe4gchNk5CLpyEUMEskss3Iz2zIt16xdtWzF3da+baWVfd2srXWzbxfb0rbUdNFyV+1qbj+zMt3KvICgpCC6pCCCV+SqwMxzfn/gICj3eWaeGXi//rF55plnzswuvDnnPOdzhJQSREREZDud1g0gIiLqLhiqREREKmGoEhERqYShSkREpBKGKhERkUoMWrxpcHCwjI6O1uKtiYhcVmZm5mkpZYjW7aDWaRKq0dHRyMjI0OKtiYhclhDiqNZtoLZx+JeIiEglDFUiIiKVMFSJiIhUosmcKhERqSMzM7OPwWBYDmAw2FGyNwXAT2azecbQoUNPtnQCQ5WIyIUZDIblYWFhcSEhIWU6nY7F3O1IURRx6tSp+NLS0uUA7mrpHP5VQ0Tk2gaHhIRUMFDtT6fTyZCQkHI0jAq0fI4D20NEbSnaBXz3WsO/RB2nY6A6zsXvutXs5PAvkTMo2gWk3wVY6gC9O/DABiAiVetWEVEnsadK5AyOfNcQqNLS8O+R77RuEVGHHTx40H3gwIEJlx+fPHlyVGZmpqcWbdIKe6pEziB6ZEMP1dpTjR6pdYuIbLZ27VpVKkCZzWYYDK4RV+ypEjmDiNSGId9b5nHol+zu+8OnfV7elBf2/eHTPmpd02w245577ok2mUzxY8eOjamsrNSlpqYO+s9//uMNAO+++26gyWSKHzhwYMLDDz9stL7uN7/5TeTgwYPjBgwYkPD444+HW48bjcbEOXPm9B06dOigpUuXBg4ePDhu48aNfgAwa9Ys4//8z/8Yr2yF9lwj+ol6gohUhinZ3feHT/s89I/dJrNF0b33/c/KigeH5d8wILja1useOXLE89133z0yZsyY6kmTJkW/8sorIU2ec3v++eeNmZmZuSEhIeaRI0eaVq9e3Xvq1KnnXn/99eLQ0FCL2WzGiBEjBu3cudPr2muvPQ8Anp6eSmZm5kEASE1Nrbn33nuvqq+vL9qyZYt/VlZWrq1ttgf2VImcEe8EJjv54fBpP7NF0SkSMFsU3Q+HT/upcd2wsLC6MWPGVAPA1KlTz2zfvt3X+tz333/vM3z48Mrw8HCzm5sbJk+efHbbtm2+AJCenh4YHx8fFx8fH3/o0CHPvXv3Ns7BTps2rcz63ykpKRfuvffeM/fdd9+A5cuX/+zp6emUdzyzp0qkpaJdDTclRY+81EvlncBkR9cPCK587/ufFbNF0Rn0OuX6AcGValxXCNHqYylbzr+8vDz3xYsXh17swVomTpwYfeHChcbOnp+fn9L0/P3793v5+flZSkpK3ACcV6PdamNPlUgr1vDcsqDhX2uvtNmdwLXA1pfYYyXV3DAguHrFg8PyfzfqqmK1hn4BoKSkxP3rr7/2AYA1a9YEjhgxosr63I033li9c+dOv5KSEoPZbMb69esDb7rppqqysjK9l5eXEhgYaCkqKjJs3brVv7Xrp6en9z579qxhy5YteXPmzIk8ffq0Xo12q42hSqSV1pbRWO8EFjpAKsB/tzYPXVtwWJnQEKxPj40tVStQASAmJubCihUrgkwmU3xZWZlhzpw5p6zPRUVF1c+fP7941KhRpri4uIQhQ4bUTJky5dx11113fvDgwTUDBw5MmDp1avTQoUOrWrp2SUmJ4bnnnuuXnp5+ZMiQIbUzZsw4OXPmzAi12q4m0Vq33J5SUlIkNymnHq+tYd6iXQ091P9uBaAAQt9wZ/DI2fZ5P3IJQohMKWVK02N79+49cvXVV5/Wqk090d69e4Ovvvrq6JaeY0+V
qD326t21tYwmIhW46RnA4NEQqF1Zu3p5uztTYII9WqIuUeVGJSHE4wBmAJAAcgBMl1JeUOPaRJqyd++urWU01tA98h3gFXQpBDvy/i21u6MFJtijJeoym3uqQggjgEcBpEgpBwPQA7jP1usSOQWtywdGpDYE6hezgS1/6fjcakvt7miBCa0/M5ELU2tJjQGAlxCiHoA3gOMqXZdIW1qXDyza1RCoirnhsaX2UkC2pbV2d6TAhNafmciF2RyqUspiIcSrAArRsG5os5Rys80tI3IGTYdgm64ldZQj3wGK5dJjoetYyNnSbq0/M5ELszlUhRABAMYD6A/gHID1QogpUsr3LztvJoCZABAZGWnr2xI5jpblA72C0HCrwkXXPdLxttjSbpZMJOoSNe7+/QWAn6WUp6SU9QA+BjDi8pOklEullClSypSQkJArLkJELTh/Bpd+THWAZy8tW0PUIr1ePzQ2NjZ+wIABCYMGDYp//vnnQy2WSyMs3377rXdqauqgqKiowfHx8XE33XTTgF27dnkBwBNPPBE+f/78UACYOHFitNFoTBw0aFB8dHT04Lvvvjv6559/drNex2g0JppMpvjY2Nh4k8kU//777/d2+IdthxqhWghguBDCWzTUpRoNwCkLHRO5nOiRl5bVGDxsX1ZDZAceHh5KXl7egcOHD+/fsmVL/ubNm/3nzJkTDgBFRUWGKVOmXLVgwYJjR48e/enAgQO5zzzzTOnBgwc9WrrWX/7yl2MHDx48UFBQ8FNSUlLNzTffPOjChQuNNQ+3bduWn5eXd2D9+vX/feqpp5yuAITNoSql3AngQwB70LCcRgdgqa3XJeqRLg9BW7aEa60MIlHBVh98/XwYCraqtvWbldFoNC9fvvzIypUr+yiKgldffbXPvffee+bWW29trN502223VU2dOvVcW9fR6XR47rnnTgYHB9d/+OGHV5QvPHfunL5Xr16Wll6rJVXu/pVSPgfgOTWuRdRjtbY+tKvzm60tq6GerWCrDz641wSlXocf31bwm3X5iLlJtXKFABAfH1+nKAqKi4sNubm5XtOmTTvT1WsNGTKkJjc3t3HnmlGjRpmklOLYsWPuK1asKFCnxephRSUiZ6H2+tDGGsJdrMhE3VPBVj8o9TpIBVDqdSjYqsrWb5drrQTukCFDYmNiYhKmT5/eoaHby6+zbdu2/EOHDu3PyMg4MGfOnMjy8nKnyjGnagxRj6Z2CNoydEzdV8xNldC5KRA6QOemIOYmVbZ+a+rAgQPuer0eRqPRHBcXdz4zM9Pb+ty+ffvynn322eMVFRUd2mUmJyfHOz4+/opt3hISEmqDgoLq9+zZ49nS67TC/VSJnIU91odyaQxdLuamavxmXT4Ktvoh5qZKtYd+jx8/bkhLS4uaPn36SZ1Oh9mzZ58cPnx43C9/+cty67xqdXV1ux06RVHw17/+tc+pU6fcJk6cWHH588XFxYZjx455DBgwoE7N9tuKoUrkTBiC5AgxN1WrGaa1tbW62NjYeLPZLPR6vZw8efKZ55577gQAREZGmlevXl0wd+7cfr/97W/dgoKCzAEBAebnn3++xcp7f/rTn/otXLiw74ULF3TJycnVW7ZsOejp6dk4Bjxq1CiTTqeD2WwW8+fPPxYREWFW63OogVu/ERG5CG795hy49RsREZEDMFSJiIhUwlAlIiJSCUOViIhIJQxVIiIilTBUiYiIVMJQJSIim7S19dvGjRv9/Pz8kuLi4uL79++fMHPmzH7W1zXd9s3KaDQmlpSUGICGHW7GjRvXv1+/fokJCQlxSUlJsatWrerd9LrWbeBGjBhhKi4ublZ7YfTo0VclJSXFNj32xBNPhHt5eSU3Pdfb2zu5pf9eu3atf1RU1OBDhw65d/S7YKgSEZFN2tr6DQBSUlKqcnNzD+Tk5Bz46quv/Ddv3tzu7jiKomDcuHEDRo4cWXXs2LGc/fv3565bt66gqKioMeBSUlKq8vLyDuTn5x9I
Tk6ufvXVV/tYnzt9+rR+//79PhUVFfq8vLxmodi7d2/zX/7yl2ZhfrnPPvvMb86cORFffPHFoYEDB3a4ahNDlYioh9lxfIfPG5lvhO04vsPuW7815evrKxMSEs4XFha22/P7/PPP/dzc3ORTTz11ynrMZDLVzZs37+Tl5yqKgsrKSn1AQEBjdaXVq1cH/OIXvzh39913n01PTw9sev79999/ZsOGDYEnTpxosf7wpk2bfGfNmhW9YcOGwwkJCbUd+NiNGKpERD3IjuM7fGZ9M8u08qeVxlnfzDLZI1ibbv3W9PipU6f0P//8s8eYMWPaLeKfk5PjNWTIkJq2zsnIyPCNjY2NDw8PH/Ldd9/5PfLII42VpdavXx84ZcqUsw888MDZjz76qFmo+vr6Wu6///7TCxcuvKK3WldXJyZPnjzgo48+OpycnHyh/U/bHEOViKgH2VGyw8+smHUKFJgVs25HyQ67b/2WkZHhazKZ4o1G49Vjxowpj4yMNAOAEKLFOrktHZ86dWrkoEGD4gcPHhxnPWYd/i0tLd3361//+swjjzzSD2iYiz169KjHmDFjqoYMGVJrMBjk7t27m+1mM3fu3JPr1q0LOnv2bLMcdHNzk9dcc03VO++8E9yVz81QJSLqQYb3HV5p0BkUHXQw6AzK8L7D7br1G9AQfvn5+QcyMjL2p6enh2zfvt0LAIKCgsxlZWXNerPV1dX64OBgS2Ji4vl9+/Y1bhm3evXqwq1bt+Zffr7VxIkTz+3cudMPANLT0wMrKir0ERERiUajMbG4uNhj9erVzXqrwcHBlrvvvvts03lYABBCYMOGDQXZ2dk+c+fODevsZ2eoEhH1IMPDh1cvGb0kf/rg6cVLRi/JHx4+3K5bvzU1ZMiQ2j/+8Y8lL730UhgAjB49uurLL7/0Lysr0wFAenp679jY2BqDwYBx48ZV1tbWipdffjnE+vqqqqpWM+vbb7/1jYqKqgWADz/8MPCTTz45VFxcnFNcXJyzc+fOA59++mng5a+ZN2/eifT09BCLxSKaHvfz81M2bdp06MMPPwxatGhRp3qs3PqNiKiHGR4+vFrNMG1r67fLzZ49+1RMTExYXl6e+7XXXns+LS3t5PDhw2OFEAgKCqpfsWLFEQDQ6XT4/PPP/ztr1qyIN998MywwMNDs7e1tef75549Zr2WdU5VSws/Pz7JixYojBw8edD9+/Lj7Lbfc0vj5YmNj63x9fS1btmxpNn/ct29f8+2331723nvvXTG3Ghoaatm0aVP+qFGjYkNCQsxTpkw515Hvglu/ERG5CG795hy49RsREZEDMFSJiIhUwlAlIiJSiSqhKoToLYT4UAiRJ4TIFUJcp8Z1iYiIXIlad//+DcAmKeWvhBDuALzbewEREVF3Y3OoCiF6AbgRwIMAIKWsA9Dh4sNERETdhRrDvzEATgFYKYTIEkIsF0JcUUtSCDFTCJEhhMg4derUlVchIiKX1NoWbRs3bvQTQgxds2aNv/Xcm2++ecDGjRv9ACA1NXVQdHT04NjY2PiYmJiEV199tbHQgtFoTDSZTPEmkyl+2LBhg/Lz8zu8/ZqW1AhVA4BrAPxdSpkMoBrA3MtPklIulVKmSClTQkJCLn+aiIhcUHtbtIWGhta//PLLfVt7/apVqwry8vIO/Pjjj3kvvPBCvwsXLjRWN9q2bVt+fn7+gRtuuKFy/vz5rV7DmagRqscAHJNS7rz4+EM0hCx1U6UF5cjcdASlBeVaN4WIuqB6+48+J197Pax6+48271DT3hZtcXFxNX5+fpZPPvmkV1vXqaio0Ht5eSkGg+GKikTXX399VUlJiZutbXUEm+dUpZSlQogiIcQgKeVBAKMBHLC9aeSMSgvK8dmiLFjMCvQGHcY/noywGP/2X0hETqF6+48+Rb//vUmazbqz6elKxDvv5PuMuK7LJQs7skXbn/70p5Jnn33WePfdd1dc/ty0adNi3N3dlcLCQs8XX3yx0GC4Mpa+
+OIL/3HjxnWoTKDW1Fqn+j8APhBC7AOQBOCvKl2XnExxfhksZgVSAhaLguL8Mq2bRESdUP3jj37SbNZBUSDNZl31jz+quvVbS1u0jR07tgpo2Pz78vNXrVpVkJ+ff6CgoGDf4sWLw5rOnY4aNcoUGBh49XfffdcrLS3trJrttBdVQlVKmX1xvnSIlHKClJK/abspoykAeoMOQgfo9ToYTQFaN4mIOsHnuusqhcGgQKeDMBgUn+uus2nrt45u0fbMM8+ULFiwoNV50fDwcPPgwYNr/vOf/zQOSW/bti2/sLBwn8lkOj979uxwW9rpKKyoRJ0SFuOP8Y8n49q7Yjj0S+SCfEZcVx3xzjv5Qb/9bbGtQ78AOrxF2z333FNRXl6uz83NbbGOQWVlpW7//v3egwYNqm163NfXV7799ttFH330UdCJEyf0trTVEbj1G3VaWIx/tw7T0oJyFOeXwWgK6Nafk3ounxHXVdsaplYd2aLN6umnny6ZMmXKgKbHpk2bFuPp6anU1dWJ++677/TIkSOvmJ+Nioqqv+uuu86++uqrfV555ZUSNdptL9z6jagJ3ohFzoxbvzkHbv1G1EG8EYuIbMFQJWqCN2IRkS04p0rUhPVGLM6pElFXMFSJLtPdb8Qi9dVkZaFm1254pw6Dd3Ky1s0hDTFUiYhsUJOVhcLpD0HW1UG4uyNy5QoGaw/GOdVuLvtkNpbnLEf2yWytm0LULdXs2g1ZVwcoCmR9PWp27da6SaQhhmo3ln0yG2mb0/DWnreQtjmNwUpkB/re/oBOB+h0EHo96o8fR01W1hXn1WRl4fS7S1t8ztV5e3s365q/+eabQdOmTYts6zUbN270++qrrxqrJ/3f//1fyOLFi4PUbNfBgwfd33nnnUA1r9keDv86mCPnXjJOZKDOUgcFCuqVemScyEBSnyS7vqc9lBaUI29Hw3rv2OF9Od9JDtPaz6v1uL63P068tBCwWACdDlJRcG7tWpxbvx6BD02H3+jRKP/0M5hPn0bVd98BZjOHiC/asmWLn6+vr+XWW2+tBoCmu9yo5dChQx5r164N/P3vf++wusEMVQdy9NxLSmgK3PXuqFfq4aZzQ0poSvsvcjKlBeX49PU9sJgbipTk/VCCCbOvYbCS3bX289r0OHQ6QFEAKS/9CwCKgrPL38PZlf9oCNwmrEPEWoZqUe5Zn2N5ZX79YgMqI+ICVams1Jrjx48bpk+fHlVcXOwOAK+//nphVFRU/apVq0J0Op1ct25d0BtvvFG4efPmXr6+vpYXXnjhxPbt270efvjhqPPnz+uioqJq16xZcyQkJMSSmpo6aOjQoVXff/99r8rKSv0777xzZOzYsVVmsxmzZs3q98MPP/jV1dWJtLS0k08++eTpefPmGQsKCjxjY2Pj77///tPPPffcSXt+VoCh6lAtzb3Y8wcrqU8Slo1ZhowTGUgJTXHJXmpDMYZLVb8sikRxfhlDleyutZ/XpscBNASrEBB6PWR9/aVgBa4IVAgB4eYG79RhjvsglynKPevz7yV7TRaL1O39plC5Y9bV+bYGa21trS42Njbe+ri8vFx/6623lgPA7373u4gnnnjixG233VZ16NAh99tuu21gQUHB/mnTpp2yhigAbN68uXG/1QcffLD/okWLCu+4446qxx57LPzpp58OX7FiRREAmM1mkZOTk7t27Vr/F154IXzs2LH5b7zxRrC/v7/lp59+yj1//rwYNmxY7Lhx4yoWLFhQ/Nprr4V+++23h235fJ3BUHUg79RhEO7ukPX1DvvBSuqT5JJhatVQjEE0BqteJ1iQgRyitZ/Xy4+HPjMXlnPl8E4dhspvvsHZ5e9duohefylYDQb0njgR/hPGa9pLPZZX5mexSB0koFik7lhemZ+toerh4aHk5eU17qP95ptvBmVkZPgAwA8//NDr0KFDXtbnqqqq9GVlZa3ez3PmzBl9ZWWl/o477qgCgLS0tDOTJk2KsT4/adKkMgAY
MWJE9ZNPPukOAF9//XWvvLw87w0bNgQAQGVlpf7AgQOe7u7uDq/Dy1B1IO/kZESuXMH1bJ0QFuOPCU9cwzlVcrjWfl7b+jn2Tk6Ge0QEKjd/Bb8xt8LDZEL5p58BgOZhatUvNqBy7zeFimKROp1eKP1iA2za+q09UkpkZGTk+vr6qhJwnp6eEgAMBgMsFou4+B7itddeK5w4cWKzTdA3btyo6l6xHcFQdTDv5GSn+MFyJSzGQFpp7ee1rZ/jgMmTETB5crNznUlEXGD1HbOuznfUnOoNN9xQ8fLLL/d58cUXTwDA9u3bvUaMGHHez8/PUlFRccVWbkFBQZZevXpZNm3a5Dt27Niq9957L+i6666raus9br311vK///3vIXfeeWelh4eH3Ldvn0d0dHS9v7+/paqqyqHbxXFJDRFRDxMRF1h93d1Xldo7UAFg6dKlRXv27PExmUzxV111VcLixYtDAGDixInn/v3vf/eOjY2N37Rpk2/T16xcufLnp59+up/JZIrft2+f18KFC4+39R6PP/746djY2AuJiYlxAwcOTEhLS4uqr68Xqamp5w0Ggxw0aFD8n//85z72/JxW3PqNiMhFcOs358Ct36hLao9WoOLbItQerWj/ZGpXaUE5MjcdQWlBudZNISI74Zwqtaj2aAVOL8+BNCsQBh2CZyTCI6pX+y9UQWlBebfbJYabnxP1DAxValFtQTmkWQEkIM0KagvKHRKq3TV8Wtr8vDt8LiJqjsO/1CKPGH8Igw4QgDDo4OGgAGgpfLoDbn5O1DOo1lMVQugBZAAollLeqdZ1SRseUb0QPCOxoYca4++woV9r+FgsSrcKH25+TtQzqDn8+0cAuQAc89uX7M4jqpfDwtSqO4cP19sSdX+qDP8KIfoBuAPAcjWupxXenekcwmL8MXRsNAOIyEVYt347ePCguxBi6IIFCxrXhE6bNi3yzTffDJo6dWpkbGxs/FVXXZXg6el5TWxsbHxsbGz8ypUrAwCgvr4eAQEBV8+aNcvY9Nq1tbXiD3/4gzEqKmrwwIEDExITE+PWrVvXa8iQIbGxsbHxffv2TQwICLjaer2DBw+6O/bTN6dWT/UNAE8BaLUklBBiJoCZABAZ2eY2e5rorjfIEBE5UmBgoPndd9/tM3v27FPWkoIAsHr16kKgIXjvvPPOgU1rBQPAxx9/7N+/f//aDRs2BLz11lvFOl1Dn+/xxx8PLy0tdcvLy9vv5eUli4qKDF9++aXfvn378oBLdYZXrVpV6MCP2Sqbe6pCiDsBnJRSZrZ1npRyqZQyRUqZEhISYuvbqq673iDT03BtLVH7juZk+3y35h9hR3Oyfdo/u3MCAwPNN9xwQ+WSJUs6teH4P//5z8A//OEPJ8LDw+u2bNniAwCVlZW6NWvWhCxfvrzQy8tLAkBERIR5xowZTvsLWo3h3+sB3CWEOALgXwBuEUK8r8J1HYp3Z7o+69rais1HcHp5DoOVqAVHc7J9Pnn5z6ZdGz4yfvLyn032CNb58+eXLF68ONRsNnfo/KqqKrF9+3a/yZMnl0+aNOns+++/HwgABw4c8Ojbt29dYGCgonYb7cXmUJVSPiOl7CeljAZwH4AtUsopNrfMwaw3yFx7V4xDh34dMY9r795bTVYWTr+7FDVZWXa5fke1tLaWiJorzMn2s5jNOkgJxWzRFeZkq76TS2xsbF1SUlL1u+++G9iR89etW9d7+PDhlX5+fsqUKVPKNm3aFNDRQHY2LP7QhKPvznTEPK69KyPVZGWhcPpDkHV1EO7uiFy5ot1dOWqysuyyHZZ1ba31s3Z2bW3t0QqHLyEicrTIxKTKzC8+UxSzRacz6JXIxCS7bP02f/780nvvvfeqa6+9tt3r/+tf/wrMzMz0NRqNiUDDJucbN270Gz16dHVJSYl7WVmZLiAgwCV6q6oWf5BSbuUa1Y5zxDyuvXtvNbt2Q9bVAYoCWV+Pml272z4/KwtHH3gQ59auxbm1a1H4
wIOq9XCta2t7jYnu9B8PHDqmniIqMan67qefyx921z3Fdz/9XH5UYpJddqpJTk6+MHDgwPPffPNNm3/dnj17VpeRkeF77NixfcXFxTnFxcU5CxcuLFyzZk2gn5+fct99951OS0uLvHDhggCAo0ePur399tsd6gFrgRWVNOSIeVxbKiO1NTSdebQMS749jKLIWAh3d0Cvh3Bzg3fqsDavWbNrN1Bf3/i4I0Hc7PXtDDV7RPVCr5sjOt3T5NAx9SRRiUnVI3/9YKm9AtXq2WefLTlx4kSbS1zef//9gBEjRlRab0QCgPvuu+/cV1991fv8+fPijTfeKA4ODjabTKaEgQMHJowbN+6q0NBQpx0b5tZvGnNE8fiuDGu2NTSdebQMv1m+A3VmBe4GHf45wgcRhXnwTh3WoaHfow88CNTVAUDDkHH6Pzo0BNyVoeaO0nIDAaKO4tZvzqGtrd84p6oxR8zjdqUyUlsF4HcUnEGdWYEigXqzgu2e4Zj1uxs7dF3v5GREpf+jS3OqLQ01qzYfq1FZRiLqXhiq1KK2avAOjwmCu0GHerMCN4MOw2M6tRwNAOAWHt6hnm1T3qnDINzdIevrOzTU3FlalGUkou6FoUotaqsG79CoAHwwYzh2FJzB8JggDI3q+FxwR4dwj+fnomh/DiISEhFuigPQ0MuNXLmioYfayUAmInIEhiq1qq2h6aFRAZ0KU6uODOEez8/F+hfnwWI2Q28wYNKzC5oFqzOEaUuhT0TEUCWH6sgQbtH+HFjMZkhFgcVsRtH+HKcKrrZCn4h6NoYqAVDvLuSarKw2h2c7MoQbkZAIvcHQGFoRCYktvpdWvUVnD30i0g5DlTpU2akjAdbR+dJzPp4o6tMbET6e8G7hOuGmOEx6dkGb76dlb7Gjoe8I2SezkXEiAymhKUjqk6RZO6hne/rpp8M++uijIJ1OJ3U6Hfz9/c3l5eWGmpoaXVlZmcFoNNYBwFtvvXV03rx5/U6ePOnm4eGhuLm5yaVLlx4ZMWLEeQAwGo2JGRkZuX379jULIYbOmDHjxLJly44BwPz580Orqqr0r7/++nEAePvttwPfeOONMEVRhF6vl0lJSdVLliw5FhwcbNHum2CoEtpePgN0PMBsnS9tKszUN7QAABQSSURBVNwU12ZI2tJbtLWH25HQd4Tsk9lI25yGOksd3PXuWDZmGYOVHO7rr7/2+fLLL3vn5OQc8PLykiUlJYba2loRHR1dv3HjRr/XXnst9Ntvvz1sPX/evHlYtWpVwY033ljzt7/9LWjOnDn9tm/ffujy67q7u8svvvgioKSkpLRv377Nij18+OGHvZYsWRL65ZdfHurfv3+92WzG4sWLg4qLiw1ahyorKlG7lZ1aCrCWWOdL26qu1NFrtcfaWxQ6Xad6i9ZQ/2Hd+1j/4jwcz8/t0vuHm+Jw7d33ajrsm3EiA3WWOihQUK/UI+MEC6pQx1w4XOZTvunnsAuHy2zeoaa4uNgtMDDQbK2I1LdvX3N0dHR9e68DgBtvvLG6tYpLer1eTps27dRf//rX0Mufe+mll/ouXLjwWP/+/esBwGAw4LHHHjtz9dVX19ryWdTAUO3GOro7TXs79HQ0wKzzpSGPPtrq0G9L1+rqLjrxN45G4i23dWroV61QdwYpoSlw17tDL/Rw07khJTSl/RdRj3fhcJnP6X/sN1VuO2Y8/Y/9JluDdcKECRXHjx93j46OHjxlypTIf//7374dfe3nn3/e6/bbbz/X2vNPPvnkyY8//jjwzJkz+qbHDx8+7DVixIgaW9ptLxz+VUnm0bIurdu0l86W3Wtr+UxnhjvbW/Jy+bWCPIydLg94+RBywqhb2jy/KWeaD7VVUp8kLBuzjHOq1Cm1h8/5wSJ1kAAsUld7+Jyf54CALtcA9vf3V3766acDmzZt8vvmm2/8Hnjggavmz59/7NFHHz3T2mumTZsWc/78eZ2iKMjIyGh1uCgwMFCZNGnSmYULF/bx8vJq
cZeaXbt2eU2bNq1/dXW1bv78+cVpaWmabmDOnqoKrLVwX9t8EL9ZvgOZR7XflF7tAvFqDnc2vVZX2mlLb9Ma6tffO6VbLIVJ6pOEGYkzGKjUYR4DeldCLxQIAHqheAzobfPWbwaDAXfeeWflokWLjr/yyiuFn376aZs9i1WrVhUUFhbmTJgw4WxaWlpkW+c+88wzJ9asWRNcXV3dmFcDBgw4v337dm8ASE1NPZ+Xl3fg5ptvrjh//rzmmaZ5A7qDy2vh7iho9Q80h7FldxpH6ko7uzqfauUM86FEWvEcEFAd/GBCvt+ofsXBDybk29JLBYC9e/d65OTkeFgfZ2VlefXr16+uvdd5eHjIRYsWFWdnZ/vs2bPHs7XzQkNDLePGjStbs2ZNsPXYU089VTp37tx+//3vf92sx6xbw2mNw78qUKMWrto6UiBeq3Wel69l7Wwhe2e5+5bIVXkOCKi2NUytKioq9I8++mhkRUWFXq/Xy+jo6Nr09PSjHXmtr6+vfPjhh08sXLgwdN26da2+Zt68eaXp6ekh1seTJ08uP3nypOH2228faLFYRK9evSyxsbHnx48fr/lGyNz6TSXONqfans6u8+xqAF8eoJevZQ19Zi4s58pZy5eoA7j1m3Pg1m8O0NVauFrpzDrPrhZaaKkYRLO1rHV1KH3xL4CiqL4/KhGRFjin2kN1Zl6yqzcGtVgMoslaVuh0gMXS7PmuOp6fi52frOvyulMiIjWwp9pDdWZesqvLUFoqnt+09q++tz9OvLTQ5v1RWeCeejhFURSh0+kcP5fXAymKIgC0uLwHYKj2aO2VAmx6XlduDGqteH7TtaweJpPN+6OywD31cD+dOnUqPiQkpJzBal+KoohTp075A/iptXMYqtQhHQ3gy7VXDEKN/VG7U0EHos4ym80zSktLl5eWlg4Gp/TsTQHwk9lsntHaCTbf/SuEiACwCkDYxTdcKqX8W1uv6Y53/5K2uGk49QQt3f1LzkWNnqoZwGwp5R4hhB+ATCHEV1LKAypcm6hDutqTJiJSk81DBVLKEinlnov/XQkgF4DR1utSz9HVgvquIvNoGZZ8e9gpylcSkX2pOqcqhIgGkAxgZwvPzQQwEwAiI9ss9Ug9SGcL/7saa13oOrMCd4MOH8wY7lLrmYmoc1Sb1BZC+AL4CMBjUsoruhxSyqVSyhQpZUpISMiVFyCnZq/epNqF/52NM9aFJiL7UaWnKoRwQ0OgfiCl/FiNa5LzsGdv0lpQ33ptZy3831XOWBeaiOzH5lAVQggA7wHIlVK+bnuTyNm01JtULVQ7UPjflQ2NCsAHM4a7VF1oIuo6NXqq1wOYCiBHCJF98dj/Sim/UOHa3VJpQTmK88tgNAW0ujG4M7F3b9Ijqle3C9OmXK0uNBF1nc2hKqX8HoBT7GPnCkoLyvHZoixYzAr0Bh3GP57s9MHa3XuTRERqYUUlByvOL4PFrEBKwGJRUJxf5vShCnT/3iSR2rJPZiPjRAZSQlOQ1CdJ6+aQg7hkqLra8GlTRlMA9AYdLBYFer0ORhOHBYm6m+yT2UjbnIY6Sx3c9e5YNmYZg7WHcLlQdcXh06bCYvwx/vFkl/2joLtztc3myTllnMhAnaUOChTUK/XIOJHBUO0hXC5UXXX4tKmwGH+Xa3NPwEINpJaU0BS4691Rr9TDTeeGlFCW6+0pXC5UOXxK9tJSoQaGKnVFUp8kLBuzjHOqPZDLhSqHT8leWKiB1JTUJ4lh2gO5XKgCHD4l+2ChBiKylUuGKpG9sFADEdmCu8QTEdlZTVYWTr+7FDVZWVo3heyMPVVyCa68Npl6tpqsLBROfwiyrg7C3R2RK1fAOzlZ62aRnbhcT7W0oByZm46gtJttEUats65N3vlZAT5blMX/7cml1OzaDVlXBygK5IULKJn3J/ZYuzGXClX+cu2ZWlqbTOQqvFOHAbpLv2rrCgpwdOo0Bms35VKhmrejBOb6hl+uZjN/ufYU1rXJQgeu
TSaX452cDENoaPODZjNqdu3WpkFkVy4zp1paUI7c749fOiABTx837RrUA2lVwo9rk8nV6Tw8rjjmnTpMg5aQvblMqBbnl0FRmh+7UF2vTWN6IK1L+HFtcufx5i7n4XvLzThbUND4OHDGb3mzUjflMqFqNAVArxewWCQAQG8QHAZ0IJbwcy3W+w/MZgU6IXDj/SYkjDRq3aweqSYrC2Xvf9DwQKdD4EPTETpnjraNIrtxmVANi/HHhNnXIG9HCQAgdnhf/vXtQCzh51qK88tgNiuABBQp8Z9/5iPI6MufGQ003v0LAEJA78d9ibszlwlVgEOAWmIJP9diNAVAJwQU2TCyo0jpkjs6dQfeqcMg3N0h6+sh3Nw4l9rNuVSokrZYws91hMX448b7TfjPP/OhSAmDgXdNa8U7ORmRK1egZtdueKcO41xqN8dQJeqmEkYaEWT05c1KTsA7OZlh2kMwVIm6MU6ZEDmWSxV/ICIicmaqhKoQYqwQ4qAQ4rAQYq4a1yQiInI1NoeqEEIPYAmA2wHEA7hfCBFv63WJiIhcjRo91VQAh6WUBVLKOgD/AjBehesSERG5FDVC1QigqMnjYxePNSOEmCmEyBBCZJw6dUqFtyUiInIuaoSqaOGYvOKAlEullClSypSQkBAV3paIiMi5qBGqxwBENHncD8DxVs4lIiLqttQI1d0ABgoh+gsh3AHcB2CDCtclIiJyKTYXf5BSmoUQjwD4EoAewAop5X6bW0ZERORiVKmoJKX8AsAXalyLiIjIVbGiEhERkUoYqkRERCphqBIREamEoUpERKQShioREZFKuJ+qE9v+8WEUZJ9ETFIfjLhngNbNISKidjBUndT2jw8ja3MhADT+y2AlInJuHP51UgXZJ9t8TEREzoeh6qRikvq0+ZiIiJwPh3+dlHWol3OqRESug6HqxEbcM4BhSkTkQjj8S0REpBKGKhERkUoYqkRERCphqBIREamEoUpERKQShioREZFKGKpEREQqYagSERGphKFKRESkEoYqERGRShiqREREKrEpVIUQrwgh8oQQ+4QQnwgheqvVMCIiIldja0/1KwCDpZRDAOQDeMb2JhHZV2lBObauycPWNXkoLSjXujlE1I3YtEuNlHJzk4c7APzKtuYQ2VdpQTk+fX0PLGYJAMj7oQQTZl+DsBh/jVvm+mqyslCzazf0vf1hOVcO79Rh8E5O1rpZRA6l5tZvDwFYq+L1iFRXnF/WGKgAYFEkivPLGKpdYA1R79RhAIDC6Q9B1tUBigIIAeHhgciVKxis1KO0G6pCiK8BhLXw1Dwp5WcXz5kHwAzggzauMxPATACIjIzsUmOJbGU0BUBvEI3BqtcJGE0BGrfK9dRkZTWGqHB3h//48ZcCFQCkhKyvbwhdhir1IO2GqpTyF209L4R4AMCdAEZLKWVr50kplwJYCgApKSmtnkdkT2Ex/pjwxDXI21ECAIgd3pe91C6o2bW7MURlfT0AQLi7Nw9WRYGlskLDVhI5nk3Dv0KIsQCeBjBKSlmjTpOI7Cssxp9BaiPv1GENIVpfD+HmBv8J4+E/YTzKP/0M59avbwhWKXF2+Xtwj4hAwOTJWjeZyCFsvft3MQA/AF8JIbKFEO+o0CYicnLeycmIXLkCIY8+2jhv2jjMa+2pXlS5+SsNWkikDVvv/h2gVkOIyLU0C1I0zLOWf/LJFed5xMU6sllEmmJFJSJSRc2u3ZBmc/ODQkDv10ubBhFpgKFKRKrQ9/ZvPvR7cVmNdclNUzVZWTj97lLUZGU5sIVE9qfmOlUi6sEs58oBIQApASHgM2IEgh+ZdcWSmsuX43AtK3Un7KkSkSq8U4dBeHgAej2Eh0eLgQpcuRynZtduDVpLZB/sqRKRKqx3BFurLLXW+7x8OU5Lw8NErkq0Ua/BblJSUmRGRobD35eInEPTEocc+u04IUSmlDJF63ZQ69hTJXIhx/NzUbQ/BxEJiQg3xWndnC67fDkOUXfBUCVy
Acfzc7F/2xbs3/o1FMUCvcGASc8ucOlgJeqOGKpETu54fi7WvzgP5vr6hjtrAVjMZhTtz2GoEjkZ3v1L5OSK9ufAYjY3BiqEgN5gQERCorYNI6IrsKdK5OQiEhKhNxhgMZuh0+mQcNOtSBh1C3upRE6IoUrkALbcYBRuisOkZxd0ixuUiLo7hiqRnVnnRC1mc5dvMAo3xTFMiVwA51SJ7Mw6JyoVpfEGIyLqnhiqRHZmnRMVOh1vMCLq5jj8S2RnnBMl6jkYqkQOwDlRop6BoUrkwkoLylGcXwajKQBhMf7IPFqGT3N/gMH7Z4yPHYmkPklaN5GoR2GoErmo0oJyfLYoCxazAr1Bh0H3DcDDWzdCH74UEGZsOJqO925bzmAlciDeqETkoorzy2AxK5ASsFgU7M8+AcX9MCDMEEKiXqlHxgnuBkXkSAxVIhdlNAVAb9BB6AC9XoeEpFDo6gYA0gApBdx0bkgJbdglLPtkNpbnLEf2yWyNW03UvXH4l8hFhcX4Y/zjyc3mVN/vNxmf5vZrNqeafTIbaZvTUGepg0FnwIQBEzDuqnEcFiayA25STtTNLc9Zjrf2vAUFCgBAQMBD74FlY5YxWF0MNyl3fqoM/woh5gghpBAiWI3rEZF6UkJT4K53h4AAAEhwvpXIXmwOVSFEBIBbARTa3hwiUltSnyQsG7MMvzL9Cu46d+iFvtl8KxGpR4051UUAngLwmQrXIiI7SOqThKQ+SbjrqruQcSIDKaEpHPolsgObQlUIcReAYinlXiFEe+fOBDATACIjI215WyLqImu4EpF9tBuqQoivAYS18NQ8AP8LYExH3khKuRTAUqDhRqVOtJGIiMgltBuqUspftHRcCJEIoD8Aay+1H4A9QohUKWWpqq0kIiJyAV0e/pVS5gDoY30shDgCIEVKeVqFdhEREbkcVlQiIiJSiWoVlaSU0Wpdi4iIyBWxp0pERKQSTcoUCiFOATjq8De+JBgA534b8Lu4hN9Fc/w+LnGW7yJKShmidSOodZqEqtaEEBmsn9mA38Ul/C6a4/dxCb8L6igO/xIREamEoUpERKSSnhqqS7VugBPhd3EJv4vm+H1cwu+COqRHzqkSERHZQ0/tqRIREamOoUpERKSSHh+qQog5QggphAjWui1aEUK8IoTIE0LsE0J8IoTorXWbHE0IMVYIcVAIcVgIMVfr9mhFCBEhhPhWCJErhNgvhPij1m3SmhBCL4TIEkJs1Lot5Px6dKgKISIA3AqgUOu2aOwrAIOllEMA5AN4RuP2OJQQQg9gCYDbAcQDuF8IEa9tqzRjBjBbShkHYDiAWT34u7D6I4BcrRtBrqFHhyqARQCeAtCj79aSUm6WUpovPtyBhm38epJUAIellAVSyjoA/wIwXuM2aUJKWSKl3HPxvyvRECZGbVulHSFEPwB3AFiudVvINfTYUBVC3AWgWEq5V+u2OJmHAPw/rRvhYEYARU0eH0MPDhIrIUQ0gGQAO7VtiabeQMMf3orWDSHXoNouNc5ICPE1gLAWnpoH4H8BjHFsi7TT1nchpfzs4jnz0DD894Ej2+YERAvHevTohRDCF8BHAB6TUlZo3R4tCCHuBHBSSpkphLhJ6/aQa+jWoSql/EVLx4UQiQD6A9grhAAahjv3CCFSpZSlDmyiw7T2XVgJIR4AcCeA0bLnLV4+BiCiyeN+AI5r1BbNCSHc0BCoH0gpP9a6PRq6HsBdQohfAvAE0EsI8b6UcorG7SInxuIPAIQQRwCkSCmdYRcKhxNCjAXwOoBRUspTWrfH0YQQBjTcoDUaQDGA3QB+LaXcr2nDNCAa/spMB3BWSvmY1u1xFhd7qnOklHdq3RZybj12TpWaWQzAD8BXQohsIcQ7WjfIkS7epPUIgC/RcGPOup4YqBddD2AqgFsu/n8h+2JPjYg6gD1VIiIilbCnSkREpBKGKhERkUoYqkRERCphqBIREamEoUpERKQShioREZFKGKpEREQq+f/o
MbvoMgYzqQAAAABJRU5ErkJggg==\n",
      "text/plain": [
       "<Figure size 432x288 with 1 Axes>"
      ]
     },
     "metadata": {
      "needs_background": "light"
     },
     "output_type": "display_data"
    }
   ],
   "source": [
    "from matplotlib import cm\n",
    "import matplotlib.pyplot as plt\n",
    "from sklearn.utils import check_random_state\n",
    "from sklearn.manifold import TSNE\n",
    "\n",
    "# Group relation IDs by source dataset (the prefix before the first '::').\n",
    "dataset_id = {}\n",
    "for rel_name, i in rel2id.items():\n",
    "    rel_key = rel_name.split('::')[0]\n",
    "    dataset_id.setdefault(rel_key, []).append(i)\n",
    "\n",
    "# Project to 2-D; transpose so that row 0 holds x and row 1 holds y.\n",
    "X_embedded = TSNE(n_components=2).fit_transform(rel_emb).T\n",
    "p = cm.rainbow(int(255/2 * 1))\n",
    "fig = plt.figure()\n",
    "ax = fig.add_subplot(111)\n",
    "for key, val in dataset_id.items():\n",
    "    # np.long is not available in every NumPy release; np.int64 is explicit and portable.\n",
    "    val = np.asarray(val, dtype=np.int64)\n",
    "\n",
    "    ax.plot(X_embedded[0][val], X_embedded[1][val], '.', label=key)\n",
    "\n",
    "lgd = ax.legend(bbox_to_anchor=(1.0, 1.0))\n",
    "plt.savefig('relation.eps', bbox_extra_artists=(lgd,), bbox_inches='tight', format='eps')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Pair-wise Relation Embedding Cosine Similarity\n",
    "We calculate the pair-wise similarity between relation embeddings using cosine similarity, keep the pairs that score above 0.9, and output the 10 most similar pairs."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[('GNBR::E::Compound:Gene', 'GNBR::K::Compound:Gene', 0.98599356),\n",
       " ('GNBR::E::Compound:Gene', 'GNBR::E+::Compound:Gene', 0.98297095),\n",
       " ('GNBR::N::Compound:Gene', 'GNBR::E-::Compound:Gene', 0.96987224),\n",
       " ('GNBR::E::Compound:Gene', 'GNBR::E-::Compound:Gene', 0.96532124),\n",
       " ('GNBR::K::Compound:Gene', 'GNBR::E+::Compound:Gene', 0.9564862),\n",
       " ('GNBR::E+::Compound:Gene', 'GNBR::E-::Compound:Gene', 0.95019233),\n",
       " ('GNBR::L::Gene:Disease', 'GNBR::G::Gene:Disease', 0.9419448),\n",
       " ('GNBR::K::Compound:Gene', 'GNBR::E-::Compound:Gene', 0.94074607),\n",
       " ('GNBR::J::Gene:Disease', 'GNBR::Md::Gene:Disease', 0.9319676),\n",
       " ('GNBR::J::Gene:Disease', 'GNBR::Te::Gene:Disease', 0.93183714)]"
      ]
     },
     "execution_count": 16,
     "metadata": {},
     "output_type": "execute_result"
    }
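   ],
   "source": [
    "from sklearn.metrics.pairwise import cosine_similarity\n",
    "\n",
    "similarity = cosine_similarity(rel_emb)\n",
    "# For each row, sort column indices by descending similarity.\n",
    "idx = np.flip(np.argsort(similarity), axis=1)\n",
    "\n",
    "max_pairs = []\n",
    "for i in range(idx.shape[0]):\n",
    "    # Position 0 is expected to be the relation itself (similarity 1), so start at 1.\n",
    "    j = 1\n",
    "    # The bound on j keeps the loop from running past the end of the row.\n",
    "    while j < idx.shape[1] and similarity[i][idx[i][j]] > 0.9:\n",
    "        max_pairs.append((id2rel[idx[i][0]], id2rel[idx[i][j]], similarity[i][idx[i][j]]))\n",
    "        j += 1\n",
    "\n",
    "def sort_score(pair):\n",
    "    return pair[2]\n",
    "\n",
    "max_pairs.sort(reverse=True, key=sort_score)\n",
    "# Each pair was collected twice, as (a, b) and (b, a); keep every other entry.\n",
    "sim_pairs = []\n",
    "for i, pair in enumerate(max_pairs):\n",
    "    if i % 2 == 0:\n",
    "        sim_pairs.append(pair)\n",
    "\n",
    "sim_pairs[:10]"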
   ]
  },
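  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The `i % 2` filter above relies on the two copies of each pair, `(a, b)` and `(b, a)`, landing next to each other after sorting. As a hedged alternative, the sketch below reads each unordered pair once from the strict upper triangle of the similarity matrix. `top_similar_pairs`, `toy_emb`, and `toy_names` are hypothetical names introduced for illustration, and the toy data is made up; in this notebook one would pass `rel_emb` and `id2rel` instead."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "from sklearn.metrics.pairwise import cosine_similarity\n",
    "\n",
    "def top_similar_pairs(emb, names, top_k=10):\n",
    "    # Hypothetical helper: return the top_k most cosine-similar unordered\n",
    "    # pairs as (name_a, name_b, score), highest score first.\n",
    "    sim = cosine_similarity(emb)\n",
    "    # Strict upper triangle: each pair (a, b) with a < b appears exactly once,\n",
    "    # and the diagonal (self-similarity) is excluded.\n",
    "    rows, cols = np.triu_indices(sim.shape[0], k=1)\n",
    "    scores = sim[rows, cols]\n",
    "    order = np.argsort(scores)[::-1][:top_k]\n",
    "    return [(names[int(rows[n])], names[int(cols[n])], float(scores[n])) for n in order]\n",
    "\n",
    "# Toy data for illustration only (not DRKG embeddings).\n",
    "toy_emb = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]])\n",
    "toy_names = {0: 'a', 1: 'b', 2: 'c'}\n",
    "top_similar_pairs(toy_emb, toy_names, top_k=2)"
   ]
  },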
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We then draw a histogram showing how the pair-wise similarity scores are distributed."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "(11449,)\n"
     ]
    },
    {
     "data": {
      "image/png": "iVBORw0KGgoAAAANSUhEUgAAAYsAAAEGCAYAAACUzrmNAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADh0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uMy4xLjMsIGh0dHA6Ly9tYXRwbG90bGliLm9yZy+AADFEAAAdEUlEQVR4nO3de7gcVZnv8e+PcFWCCRIwBDCBiTPCiAE2lxFGbnJ1JKCAgXGIHDQ6Awpn9ByDNxCGAY4CI4+IhiFyGSUgCEQFmRACqCOXREJCuJgNRNkkkii3CEOGhPf8UatNsdO7q7LT1d3Z+/d5nn66atWqrnfXTvbbVWvVWooIzMzMGtmg3QGYmVnnc7IwM7NCThZmZlbIycLMzAo5WZiZWaEN2x1AFbbaaqsYPXp0u8MwM1uvzJkz5w8RMaLetgGZLEaPHs3s2bPbHYaZ2XpF0m/72ubbUGZmVsjJwszMCjlZmJlZIScLMzMr5GRhZmaFnCzMzKyQk4WZmRVysjAzs0JOFmZmVmhAPsFtVmT05J+27diLLvhg245t1l++sjAzs0KVJQtJm0p6QNLDkhZI+loqHyPpfkkLJV0vaeNUvkla707bR+c+68xU/oSkw6qK2czM6qvyymIFcFBEvBcYBxwuaR/gQuCSiBgLvACckuqfArwQEX8BXJLqIWlnYAKwC3A48G1JQyqM28zMeqksWUTmT2l1o/QK4CDgxlR+NXB0Wh6f1knbD5akVD4tIlZExNNAN7BXVXGbmdmaKm2zkDRE0lxgKTADeBJ4MSJWpio9wKi0PAp4BiBtfwl4e768zj75Y02SNFvS7GXLllXx45iZDVqVJouIWBUR44DtyK4G3l2vWnpXH9v6Ku99rCkR0RURXSNG1J27w8zM+qklvaEi4kXgbmAfYJikWpfd7YDFabkH2B4gbX8b8Hy+vM4+ZmbWAlX2hhohaVha3gz4APAYMAs4NlWbCNyalqenddL2uyIiUvmE1FtqDDAWeKCquM3MbE1VPpQ3Erg69VzaALghIn4i6VFgmqR/AR4Crkz1rwSuldRNdkUxASAiFki6AXgUWAmcGhGrKozbzMx6qSxZRMQ8YLc65U9RpzdTRLwGHNfHZ50HnNfsGM3MrBw/wW1mZoWcLMzMrJCThZmZFXKyMDOzQk4WZmZWyMnCzMwKOVmYmVkhJwszMyvkZGFmZoWcLMzMrJCThZmZFXKyMDOzQk4WZmZWyMnCzMwKOVmYmVkhJwszMyvkZGFmZoWcLMzMrJCThZmZFXKyMDOzQk4WZmZWyMnCzMwKOVmYmVkhJwszMytUWbKQtL2kWZIek7RA0ump/GxJz0qam15H5vY5U1K3pCckHZYrPzyVdUuaXFXMZmZW34YVfvZK4HMR8WtJQ4E5kmakbZdExDfylSXtDEwAdgG2Be6U9K60+TLgEKAHeFDS9Ih4tMLYzcwsp7JkERFLgCVpebmkx4BRDXYZD0yLiBXA05K6gb3Stu6IeApA0rRU18nCzKxFWtJmIWk0sBtwfyo6TdI8SVMlDU9lo4Bncrv1pLK+ynsfY5Kk2ZJmL1u2rMk/gZnZ4FaYLCTtJGmTtHyApM9KGlb2AJI2B24CzoiIl4HLgZ2AcWRXHhfVqtbZPRqUv7kgYkpEdEVE14gRI8qGZ2ZmJZS5srgJWCXpL4ArgTHAD8p8uKSN0v7fj4gfAUTEcxGxKiLeAK5g9a2mHmD73O7bAYsblJuZWYuUSRZvRMRK4Bjg3yLifwMji3aSJLLk8lhEXJwrz+97DPBIWp4OTJC0iaQxwFjgAeBBYKykMZI2JmsEn14ibjMza5IyDdyvSzoBmAh8KJVtVGK/fYF/AOZLmpvKvgicIGkc2a2kRcCnACJigaQbyBquVwKnRsQqAEmnAXcAQ4CpEbGgxPHNzKxJyiSLk4FPA+dFxNPpW/9/FO0UEb+gfnvDbQ32OQ84r075
bY32MzOzajVMFpKGAF+MiI/VyiLiaeCCqgMzM7PO0bDNIt0GGpHaCszMbJAqcxtqEfBLSdOBV2qF+UZrMzMb2Moki8XptQEwtNpwzMysExUmi4j4WisCMTOzztVnspD0bxFxhqQfU/+J6aMqjczMzDpGoyuLa9P7NxrUMTOzQaDPZBERc9L7Pa0Lx8zMOlFhm4WkscD5wM7AprXyiNixwrjMzKyDlBkb6ntkI8WuBA4ErmH1LSozMxsEyiSLzSJiJqCI+G1EnA0cVG1YZmbWSco8Z/GapA2AhWlAv2eBrasNy8zMOkmZK4szgLcAnwX2IBtJdmKVQZmZWWcp81DegwDp6uKzEbG88qjMzKyjlJlWtUvSfGAe2dwUD0vao/rQzMysU5Rps5gK/FNE/BxA0n5kPaR2rTIwMzPrHGXaLJbXEgX8eVIj34oyMxtEylxZPCDpu8B1ZGNEfRS4W9LuABHx6wrjMzOzDlAmWYxL72f1Kn8fWfLwMxdmZgNcmd5QB7YiEDMz61xl2izMzGyQc7IwM7NCThZmZlaoTAM3kt4HjM7Xj4hrKorJzMw6TJn5LK4FdgLmAqtScZANVW5mZoNAmSuLLmDniFhjHu5GJG1PllDeAbwBTImIb0raErie7EplEXB8RLwgScA3gSOBV4GP157hkDQR+HL66H+JiKvXJhYzM1s3ZdosHiH7g7+2VgKfi4h3A/sAp0raGZgMzIyIscDMtA5wBDA2vSaRTbhESi5nAXsDewFnSRrej3jMzKyfylxZbAU8KukBYEWtMCKOarRTRCwBlqTl5ZIeA0YB44EDUrWrgbuBL6Tya9IVzH2ShkkamerOiIjnASTNAA4ne6LczMxaoEyyOHtdDyJpNLAbcD+wTUokRMQSSbWJlEYBz+R260llfZX3PsYksisSdthhh3UN2czMcgpvQ0XEPcDjwND0eiyVlSJpc+Am4IyIeLlR1XqHb1DeO84pEdEVEV0jRowoG56ZmZVQZj6L44EHgOOA44H7JR1b5sMlbUSWKL4fET9Kxc+l20uk96WpvAfYPrf7dsDiBuVmZtYiZRq4vwTsGRETI+IkskbmrxTtlHo3XUl2JXJxbtN0Vk/LOhG4NVd+kjL7AC+l21V3AIdKGp4atg9NZWZm1iJl2iw2iIilufU/Ui7J7Es2X/d8SXNT2ReBC4AbJJ0C/I7sigXgNrJus91kXWdPBoiI5yWdCzyY6p1Ta+w2M7PWKJMsfibpDlb3Pvoo2R/2htIkSfXaGwAOrlM/gFP7+KypZDP2mZlZG5QZovz/SPoI2ZWCyB6uu7nyyMzMrGOUGhsqIm4ia6g2M7NBqM9kIekXEbGfpOW8uauqyO4abVF5dGZm1hH6TBYRsV96H9q6cMzMrBOVec7i2jJlZmY2cJXpArtLfkXShsAe1YRjZmadqM9kIenM1F6xq6SX02s58ByrH6QzM7NBoM9kERHnp/aKr0fEFuk1NCLeHhFntjBGMzNrszLPWZyZhtkYC2yaK7+3ysDMzKxzlJlW9RPA6WQD+M0lm8joV8BB1YZmZmadokwD9+nAnsBvI+JAsnkpllUalZmZdZQyyeK1iHgNQNImEfE48JfVhmVmZp2kzHAfPZKGAbcAMyS9gOeTMDMbVMo0cB+TFs+WNAt4G/CzSqMyM7OO0mhsqC3rFM9P75sDnlPCzGyQaHRlMYc158CurQewY4VxmZlZB2k0kOCYVgZiZmadq8xAgpL0MUlfSes7SNqr+tDMzKxTlOk6+23gb4AT0/py4LLKIjIzs45Tpuvs3hGxu6SHACLiBUkbVxyXmZl1kDJXFq9LGkKaLU/SCOCNSqMyM7OOUiZZXArcDGwt6TzgF8C/VhqVmZl1lDIP5X1f0hzgYLJus0dHxGOVR2ZmZh2jYbKQtAEwLyL+Gni8NSGZmVmnaXgbKiLeAB6WtEOL4jEzsw5Ups1iJLBA0kxJ02uvop0kTZW0VNIjubKzJT0raW56HZnbdqak
bklPSDosV354KuuWNHltf0AzM1t3ZbrOfq2fn30V8C3gml7ll0TEN/IFknYGJgC7ANsCd0p6V9p8GXAI0AM8KGl6RDzaz5jMzKwfyjRw39OfD46IeyWNLll9PDAtIlYAT0vqBmpPiXdHxFMAkqaluk4WZmYtVOY2VLOdJmleuk01PJWNAp7J1elJZX2Vr0HSJEmzJc1etswT+ZmZNVOrk8XlwE7AOGAJcFEqV526vUe8zZevWRgxJSK6IqJrxIgRzYjVzMySPpOFpJnp/cJmHSwinouIVamX1RWsvtXUA2yfq7od2Wx8fZWbmVkLNWqzGClpf+Co1Fbwpm/5EfHrtT2YpJERsSStHgPUekpNB34g6WKyBu6xwAPpmGMljQGeJWsEPxEzM2upRsniq8Bksm/zF/faFsBBjT5Y0nXAAcBWknqAs4ADJI1L+y8CPgUQEQsk3UDWcL0SODUiVqXPOQ24AxgCTI2IBWvx85mZWRM0mvzoRuBGSV+JiHPX9oMj4oQ6xVc2qH8ecF6d8tuA29b2+GZm1jxlus6eK+ko4P2p6O6I+Em1YZmZWScpM1Pe+cDpZLeIHgVOT2VmZjZIlHmC+4PAuNSDCUlXAw8BZ1YZmJmZdY6yz1kMyy2/rYpAzMysc5W5sjgfeEjSLLKurO/HVxVmZoNKmQbu6yTdDexJliy+EBG/rzowMzPrHGWuLEgP0hUOS25mZgNTOwYSNDOz9YyThZmZFWqYLCRtkJ/pzszMBifPwW1mZoXKNHDX5uB+AHilVhgRR1UWlZmZdZQq5+A2M7MBotQc3JLeCYyNiDslvYVsuHAzMxskygwk+EngRuC7qWgUcEuVQZmZWWcp03X2VGBf4GWAiFgIbF1lUGZm1lnKJIsVEfE/tRVJG5LNdGdmZoNEmWRxj6QvAptJOgT4IfDjasMyM7NOUiZZTAaWAfPJ5sy+DfhylUGZmVlnKdMb6o004dH9ZLefnogI34YyMxtECpOFpA8C3wGeJBuifIykT0XE7VUHZ2ZmnaHMQ3kXAQdGRDeApJ2AnwJOFmZmg0SZNoultUSRPAUsrSgeMzPrQH1eWUj6cFpcIOk24AayNovjgAdbEJuZmXWIRlcWH0qvTYHngP2BA8h6Rg0v+mBJUyUtzQ9xLmlLSTMkLUzvw1O5JF0qqVvSPEm75/aZmOovlDSxXz+lmZmtkz6vLCLi5HX87KuAbwHX5MomAzMj4gJJk9P6F4AjgLHptTdwObC3pC2Bs4AusquaOZKmR8QL6xibmZmthTK9ocYAnwFG5+sXDVEeEfdKGt2reDzZ1QnA1cDdZMliPHBN6pJ7n6RhkkamujMi4vkUywzgcOC6orjNzKx5yvSGugW4kuyp7TfW8XjbRMQSgIhYIqk2xtQo4JlcvZ5U1lf5GiRNAiYB7LCD52oyM2umMsnitYi4tOI4VKcsGpSvWRgxBZgC0NXV5YcGzcyaqEzX2W9KOkvS30javfbq5/GeS7eXSO+1Lrg9wPa5etsBixuUm5lZC5W5sngP8A/AQay+DRVpfW1NByYCF6T3W3Plp0maRtbA/VK6TXUH8K+1XlPAocCZ/TiumZmtgzLJ4hhgx/ww5WVIuo6sgXorST1kvZouAG6QdArwO7JnNiAbnPBIoBt4FTgZICKel3Quq5/rOKfW2G1mZq1TJlk8DAxjLZ/ajogT+th0cJ26QTbJUr3PmQpMXZtjm5lZc5VJFtsAj0t6EFhRKyzqOmtmZgNHmWRxVuVRmJlZRyszn8U9rQjEzMw6V5knuJez+tmGjYGNgFciYosqAzMzs85R5spiaH5d0tHAXpVFZGZmHafMQ3lvEhG30L9nLMzMbD1V5jbUh3OrG7B6BFgzMxskyvSG+lBueSWwiGyUWDMzGyTKtFms67wWZma2nms0repXG+wXEXFuBfGYmVkHanRl8UqdsrcCpwBvB5wszMwGiUbTql5UW5Y0FDidbIC/acBFfe1nZmYDT8M2izQH9j8D
f082Derunv/azGzwadRm8XXgw2Szz70nIv7UsqjMzKyjNHoo73PAtsCXgcWSXk6v5ZJebk14ZmbWCRq1Waz1091mZjYwlXkoz8yaaPTkn7bluIsu+GBbjmsDg68ezMyskJOFmZkVcrIwM7NCThZmZlbIycLMzAo5WZiZWSF3nbW2alc3UjNbO76yMDOzQm1JFpIWSZovaa6k2alsS0kzJC1M78NTuSRdKqlb0jxJu7cjZjOzwaydVxYHRsS4iOhK65OBmRExFpiZ1gGOAMam1yTg8pZHamY2yHXSbajxZMOgk96PzpVfE5n7gGGSRrYjQDOzwapdySKA/5Q0R9KkVLZNRCwBSO9bp/JRwDO5fXtS2ZtImiRptqTZy5YtqzB0M7PBp129ofaNiMWStgZmSHq8QV3VKYs1CiKmkM29QVdX1xrbzcys/9qSLCJicXpfKulmYC/gOUkjI2JJus20NFXvAbbP7b4dsLilAZsNAO3spuwRb9d/Lb8NJemtaU5vJL0VOBR4BJgOTEzVJgK3puXpwEmpV9Q+wEu121VmZtYa7biy2Aa4WVLt+D+IiJ9JehC4QdIpwO+A41L924AjgW7gVeDk1odsZja4tTxZRMRTwHvrlP8ROLhOeQCntiA0MzPrQyd1nTUzsw7lZGFmZoWcLMzMrJCThZmZFXKyMDOzQk4WZmZWyMnCzMwKOVmYmVkhJwszMyvkZGFmZoXaNUS5dZh2jkhqZp3PVxZmZlbIycLMzAo5WZiZWSEnCzMzK+QGbjOrXLs6UHg61+Zxsugg7pFkZp3Kt6HMzKyQk4WZmRVysjAzs0JuszCzAWswtgNW1ajvKwszMyvkZGFmZoWcLMzMrJCThZmZFVpvkoWkwyU9Ialb0uR2x2NmNpisF72hJA0BLgMOAXqAByVNj4hHqzjeYOxBYWbWyPpyZbEX0B0RT0XE/wDTgPFtjsnMbNBYL64sgFHAM7n1HmDvfAVJk4BJafVPkp5oUWxrYyvgD+0OokCnx9jp8YFjbIZOjw86NEZd+OfF/sT3zr42rC/JQnXK4k0rEVOAKa0Jp38kzY6IrnbH0Uinx9jp8YFjbIZOjw86P8Zmx7e+3IbqAbbPrW8HLG5TLGZmg876kiweBMZKGiNpY2ACML3NMZmZDRrrxW2oiFgp6TTgDmAIMDUiFrQ5rP7o6NtkSafH2OnxgWNshk6PDzo/xqbGp4gormVmZoPa+nIbyszM2sjJwszMCjlZNJmkLSXNkLQwvQ+vU+dASXNzr9ckHZ22XSXp6dy2ce2IMdVblYtjeq58jKT70/7Xp04HLY1P0jhJv5K0QNI8SR/NbavsHBYNOyNpk3ROutM5Gp3bdmYqf0LSYc2KaS3j+2dJj6ZzNlPSO3Pb6v6+2xDjxyUty8Xyidy2ienfxUJJE9sU3yW52H4j6cXctsrPoaSpkpZKeqSP7ZJ0aYp/nqTdc9v6f/4iwq8mvoD/B0xOy5OBCwvqbwk8D7wlrV8FHNsJMQJ/6qP8BmBCWv4O8I+tjg94FzA2LW8LLAGGVXkOyTpXPAnsCGwMPAzs3KvOPwHfScsTgOvT8s6p/ibAmPQ5Q9oQ34G5f2v/WIuv0e+7DTF+HPhWnX23BJ5K78PT8vBWx9er/mfIOty08hy+H9gdeKSP7UcCt5M9n7YPcH8zzp+vLJpvPHB1Wr4aOLqg/rHA7RHxaqVRvdnaxvhnkgQcBNzYn/1LKowvIn4TEQvT8mJgKTCiyXH0VmbYmXzsNwIHp3M2HpgWESsi4mmgO31eS+OLiFm5f2v3kT2z1ErrMnTPYcCMiHg+Il4AZgCHtzm+E4DrmhxDQxFxL9kXzL6MB66JzH3AMEkjWcfz52TRfNtExBKA9L51Qf0JrPmP7bx0+XiJpE3aGOOmkmZLuq92mwx4O/BiRKxM6z1kw7G0Iz4AJO1F9i3wyVxxFeew3rAzvX/2P9dJ5+glsnNWZt9WxJd3Ctk30Jp6v+9mKxvjR9Lv70ZJ
tQdyO+ocplt4Y4C7csWtOIdF+voZ1un8rRfPWXQaSXcC76iz6Utr+TkjgfeQPT9Scybwe7I/flOALwDntCnGHSJisaQdgbskzQderlNvrftfN/kcXgtMjIg3UnFTzmG9w9Up6/2z91WnzL7rqvQxJH0M6AL2zxWv8fuOiCfr7V9xjD8GrouIFZI+TXaldlDJfVsRX80E4MaIWJUra8U5LFLJv0Eni36IiA/0tU3Sc5JGRsSS9IdsaYOPOh64OSJez332krS4QtL3gM+3K8Z0e4eIeErS3cBuwE1kl7Ubpm/O/Rp6pRnxSdoC+Cnw5XS5XfvsppzDOsoMO1Or0yNpQ+BtZLcMWjFkTaljSPoAWVLePyJW1Mr7+H03+w9dYYwR8cfc6hVAbWi8HuCAXvve3er4ciYAp+YLWnQOi/T1M6zT+fNtqOabDtR6GUwEbm1Qd437nemPY61t4Gigbo+HqmOUNLx2+0bSVsC+wKORtZTNImtr6XP/FsS3MXAz2b3ZH/baVtU5LDPsTD72Y4G70jmbDkxQ1ltqDDAWeKBJcZWOT9JuwHeBoyJiaa687u+7yfGVjXFkbvUo4LG0fAdwaIp1OHAob74qb0l8Kca/JGsk/lWurFXnsMh04KTUK2of4KX0BWrdzl/VLfeD7UV2f3omsDC9b5nKu4B/z9UbDTwLbNBr/7uA+WR/4P4D2LwdMQLvS3E8nN5Pye2/I9kfum7gh8AmbYjvY8DrwNzca1zV55Csp8lvyL4tfimVnUP2xxdg03ROutM52jG375fSfk8AR1T0768ovjuB53LnbHrR77sNMZ4PLEixzAL+Krfv/0rnths4uR3xpfWzgQt67deSc0j2BXNJ+vffQ9b29Gng02m7yCaLezLF0dWM8+fhPszMrJBvQ5mZWSEnCzMzK+RkYWZmhZwszMyskJOFmZkVcrKwAUXSOyRNk/SkstFVb5P0rn58zm2ShlUU47aSbiyu+aZ9zkkP0yHpbkld67D/GZLesjb7m7nrrA0Y6SG8/wKujojvpLJxwNCI+Hlbg2ui9GTw5yNidsn6QyI3JIWkRWR97/9QTYQ2EPnKwgaSA4HXa4kCICLmRsTP09OsX5f0iKT5SvNfSBop6V5l8w88IulvU/kiSVtJGi3pMUlXKJs74z8lbZbq7CTpZ5LmSPq5pL/qHZCk/bV6foOHJA1Nn/lI2v5xSbdI+rGyOThOUzbnxEPKBqPbMtW7StKxdT7/cmUD1y2Q9LVc+SJJX5X0C+C42v6SPks2pPssSbMknSLpktx+n5R0cXN+HTaQOFnYQPLXwJw+tn0YGAe8F/gA8PU0rMSJwB0RUds2t86+Y4HLImIX4EXgI6l8CvCZiNiDbPypb9fZ9/PAqenz/xb47z7iPpFseOzzgFcjYjeyoSROavgTZ08YdwG7AvtL2jW37bWI2C8iptUKIuJSsnGCDoyIA8mG4D5K0kapysnA9wqOaYOQBxK0wWI/spFMVwHPSboH2JNsLKCp6Y/lLRFRL1k8nSufA4yWtDnZ8A4/zO5+AdnERr39ErhY0veBH0VET65+zayIWA4sl/QS2airkA3VsGvvyr0cL2kS2f/lkWSTLM1L264v2JeIeEXSXcDfSXoM2Cgi5hftZ4OPryxsIFkA7NHHtnrDMxPZRDLvJxun61pJ9b7Jr8gtryL7w7wB2bwe43Kvd9f5/AuATwCbAffVu1XV6/PfyK2/QYMvdGlAws8DB0fErmQj8G6aq/JKX/v28u9ks9P5qsL65GRhA8ldwCaSPlkrkLSnpP2Be4GPShoiaQRZgnhA2QQ2SyPiCuBKsukqC0XEy8DTko5Lx5Gk9/auJ2mniJgfERcCs4F6yaK/tiBLCC9J2gY4ouR+y4GhtZWIuJ9sSOsTafGsb7b+cLKwASOyrn3HAIekrrMLyEYHXUw2nPk8shFB7wL+b0T8nmx8/7mSHiJri/jmWhzy74FTJD1MdlVTb/rNM1LD+cNk7RW316nT
LxHxMPBQOvZUslteZUwBbpc0K1d2A/DLyKbbNFuDu86aGZJ+AlwSETPbHYt1Jl9ZmA1ikoZJ+g3w304U1oivLMzMrJCvLMzMrJCThZmZFXKyMDOzQk4WZmZWyMnCzMwK/X94hwsebRWmdwAAAABJRU5ErkJggg==\n",
      "text/plain": [
       "<Figure size 432x288 with 1 Axes>"
      ]
     },
     "metadata": {
      "needs_background": "light"
     },
     "output_type": "display_data"
    }
   ],
   "source": [
    "similarity=similarity.flatten()\n",
    "print(similarity.shape)\n",
    "\n",
    "# cleanup self-compare and dup-compare\n",
    "s = similarity < 0.99\n",
    "s = np.unique(similarity[s])\n",
    "plt.xlabel('Cosine similarity')\n",
    "plt.ylabel('Number of relation pairs')\n",
    "plt.hist(s)\n",
    "plt.savefig('relation-sim.eps', format='eps')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Pair-wise Relation Embedding Frobenius Similarity\n",
    "We calculate the pair-wise embedding similarity using L2 distance and output the top10 most similar pairs."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[('GNBR::E::Compound:Gene', 'GNBR::K::Compound:Gene', 1.6984596),\n",
       " ('GNBR::E::Compound:Gene', 'GNBR::E+::Compound:Gene', 1.8601348),\n",
       " ('GNBR::N::Compound:Gene', 'GNBR::E-::Compound:Gene', 2.3698092),\n",
       " ('GNBR::E::Compound:Gene', 'GNBR::E-::Compound:Gene', 2.606229),\n",
       " ('GNBR::K::Compound:Gene', 'GNBR::E+::Compound:Gene', 2.9946468),\n",
       " ('GNBR::E+::Compound:Gene', 'GNBR::E-::Compound:Gene', 3.1560013),\n",
       " ('GNBR::L::Gene:Disease', 'GNBR::G::Gene:Disease', 3.4119256),\n",
       " ('GNBR::K::Compound:Gene', 'GNBR::E-::Compound:Gene', 3.454293),\n",
       " ('GNBR::J::Gene:Disease', 'GNBR::Md::Gene:Disease', 3.6071572),\n",
       " ('GNBR::J::Gene:Disease', 'GNBR::Te::Gene:Disease', 3.624401)]"
      ]
     },
     "execution_count": 7,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "from sklearn.metrics.pairwise import euclidean_distances\n",
    "\n",
    "similarity = euclidean_distances(rel_emb)\n",
    "idx = np.argsort(similarity)\n",
    "\n",
    "min_pairs = []\n",
    "for i in range(idx.shape[0]):\n",
    "    j = 1\n",
    "    while (similarity[i][idx[i][j]] < 5):\n",
    "        min_pairs.append((id2rel[idx[i][0]], id2rel[idx[i][j]], similarity[i][idx[i][j]]))\n",
    "        j += 1\n",
    "\n",
    "def sort_score(pair):\n",
    "    return pair[2]\n",
    "\n",
    "min_pairs.sort(key=sort_score)\n",
    "sim_pairs = []\n",
    "for i, pair in enumerate(min_pairs):\n",
    "    if i % 2 == 0:\n",
    "        sim_pairs.append(pair)\n",
    "\n",
    "sim_pairs[:10]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Then we draw a histogram of how the pair-wise distance score distributed."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "(11449,)\n"
     ]
    },
    {
     "data": {
      "image/png": "iVBORw0KGgoAAAANSUhEUgAAAYsAAAEGCAYAAACUzrmNAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADh0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uMy4xLjMsIGh0dHA6Ly9tYXRwbG90bGliLm9yZy+AADFEAAAeb0lEQVR4nO3dfZwcVZ3v8c+XAIoCEiSwMQkG2Ogu+BDDiCiKARR58BJgfQBXiMg16oULvNa9a0BXUJYLroK7rIoblqzAYhDFQNQoRK7CZS8hTCDkgYfNBKIMickovCCIogm/+0edhmLS3VUzmerumfm+X69+dfWpU1W/VDr9S9U5dY4iAjMzs2a2a3cAZmbW+ZwszMyskJOFmZkVcrIwM7NCThZmZlZo+3YHUJU99tgjJk+e3O4wzMyGjaVLl/4mIsbVWzdik8XkyZPp7u5udxhmZsOGpF82WufbUGZmVsjJwszMCjlZmJlZIScLMzMr5GRhZmaFnCzMzKyQk4WZmRVysjAzs0JOFmZmVmjEPsFt1qkmz/5xW4679pJj23JcGxl8ZWFmZoWcLMzMrJCThZmZFXKyMDOzQk4WZmZWyMnCzMwKOVmYmVkhJwszMytUWbKQNEnSzyU9KGmVpLNT+e6SFkland7HpnJJulxSj6Tlkqbl9jUz1V8taWZVMZuZWX1VXllsBj4TEX8JHAycIWl/YDZwW0RMAW5LnwGOBqak1yzgCsiSC3A+8DbgIOD8WoIxM7PWqCxZRMT6iLg3LW8CHgQmADOAq1O1q4Hj0/IM4JrILAZ2kzQeeB+wKCKeiIgngUXAUVXFbWZmW2tJm4WkycBbgLuBvSJiPWQJBdgzVZsAPJbbrDeVNSqvd5xZkroldff19Q3lH8HMbFSrPFlI2hm4ETgnIp5uVrVOWTQp37owYk5EdEVE17hx4wYerJmZ1VVpspC0A1miuC4ifpCKN6TbS6T3jam8F5iU23wisK5JuZmZtUiVvaEEXAU8GBGX5VYtAGo9mmYCN+fKT029og4Gnkq3qW4BjpQ0NjVsH5nKzMysRaqcz+IQ4BRghaRlqew84BLgBkmnA78CPpjWLQSOAXqAZ4HTACLiCUkXAvekel+KiCcqjNvMzPqpLFlExJ3Ub28AOKJO/QDOaLCvucDcoYvOzMwGwk9wm5lZIScLMzMr5GRhZmaFnCzMzKyQk4WZmRVysjAzs0JOFmZmVsjJwszMCjlZmJlZIScLMzMr5GRhZmaFnCzMzKyQk4WZmRVysjAzs0JOFmZmVqjKmfLmStooaWWu7LuSlqXX2tqkSJImS/p9bt23ctscKGmFpB5Jl6cZ+MzMrIWqnCnv28DXgWtqBRHx4dqypEuBp3L110TE1Dr7uQKYBSwmm03vKOAnFcRrZmYNVHZlERF3AHWnP01XBx8C5jXbh6TxwK4RcVeaSe8a4PihjtXMzJprV5vFu4ANEbE6V7aPpPsk3S7pXalsAtCbq9ObyuqSNEtSt6Tuvr6+oY/azGyUaleyOJmXXlWsB/aOiLcAfwN8R9Ku1J/DOxrtNCLmRERXRHSNGzduSAM2MxvNqmyzqEvS9sCJwIG1soh4DnguLS+VtAZ4HdmVxMTc5hOBda2L1szMoMSVhaT9JL0sLU+XdJak3bbhmO8BHoqIF24vSRonaUxa3heYAjwSEeuBTZIOTu0cpwI3b8OxzcxsEMrchroR2CLpz4GrgH2A7xRtJGkecBfwekm9kk5Pq05i64btQ4Hlku4Hvg98KiJqjeOfBv4N6AHW4J5QZmYtV+Y21PMRsVnSCcA/RcS/SLqvaKOIOLlB+cfqlN1IlpTq1e8G3lAiTjMzq0iZK4s/SToZmAn8KJXtUF1IZmbWacoki9OAtwMXRcSjkvYB/qPasMzMrJM0vQ2VGp3Pi4iP1soi4lHgkqoDMzOzztH0yiIitgDjJO3YonjMzKwDlWngXgv8
p6QFwO9qhRFxWVVBmZlZZymTLNal13bALtWGY2ZmnagwWUTEF1sRiJmZda6GyULSP0XEOZJ+SJ3xmCLiuEojMzOzjtHsyuLa9P7VVgRiZmadq2GyiIil6f321oVjZmadqLDNQtIU4GJgf+DltfKI2LfCuMzMrIOUeYL738mmNt0MHEY2W921TbcwM7MRpUyy2CkibgMUEb+MiAuAw6sNy8zMOkmZ5yz+IGk7YLWkM4HHgT2rDcvMzDpJmSuLc4BXAGeRzW53CtkItGZmNkoUJouIuCcingGeBs6KiBMjYnHRdpLmStooaWWu7AJJj0tall7H5NadK6lH0sOS3pcrPyqV9UiaPfA/opmZbasy06p2SVoBLAdWSLpf0oFF2wHfBo6qU/61iJiaXgvTMfYnm0HvgLTNNyWNSaPefgM4mqw31smprpmZtVCZNou5wP+IiP8LIOmdZD2k3tRso4i4Q9LkknHMAK6PiOeARyX1AAeldT0R8Ug69vWp7gMl92tmZkOgTJvFplqiAIiIO4FN23DMMyUtT7epxqayCcBjuTq9qaxRuZmZtVCZZLFE0r9Kmi7p3ZK+CfxC0jRJ0wZ4vCuA/YCpwHrg0lSuOnWjSXldkmZJ6pbU3dfXN8DQzMyskTK3oaam9/P7lb+D7Ie79DMXEbGhtizpSl6c07sXmJSrOpFsWHSalNfb/xxgDkBXV1fDpGJmZgNTZojyw4bqYJLGR8T69PEEoNZTagHwHUmXAa8BpgBLyK4spqR5vx8nawT/yFDFY2Zm5ZS5shgUSfOA6cAeknrJrkymS5pKdkWyFvgkQESsknQDWcP1ZuCMNKUr6UHAW4AxwNyIWFVVzGZmVl9lySIiTq5TfFWT+hcBF9UpXwgsHMLQzMxsgMo0cJuZ2ShX6spC0juAyfn6EXFNRTGZmVmHKTOfxbVk3V2XAVtScZANVW5mZqNAmSuLLmD/iHBXVDOzUapMm8VK4M+qDsTMzDpXmSuLPYAHJC0BnqsVRsRxlUVlZmYdpUyyuKDqIMzMrLOVeYL7dkl7AW9NRUsiYmO1YZmZWScpM5/Fh8iG3vgg8CHgbkkfqDowMzPrHGVuQ30OeGvtakLSOOBnwPerDMzMzDpHmd5Q2/W77fTbktuZmdkIUebK4qeSbgHmpc8fxmM1mZmNKmUauP+XpL8CDiEbMnxORMyvPDIzM+sYpcaGiogbgRsrjsXMzDpUw2Qh6c6IeKekTbx0KlMBERG7Vh6dmZl1hIbJIiLemd53aV04ZmbWico8Z3FtmbI6deZK2ihpZa7sK5IekrRc0nxJu6XyyZJ+L2lZen0rt82BklZI6pF0uSSV/+OZmdlQKNMF9oD8B0nbAweW2O7bwFH9yhYBb4iINwH/BZybW7cmIqam16dy5VcAs8jm5Z5SZ59mZlaxhslC0rmpveJNkp5Or03ABuDmoh1HxB3AE/3Kbo2IzenjYmBis31IGg/sGhF3pSHSrwGOLzq2mZkNrYbJIiIuTu0VX4mIXdNrl4h4dUSc22i7Afg48JPc530k3SfpdknvSmUTgN5cnd5UVpekWZK6JXX39fUNQYhmZgblnrM4V9JYsltAL8+V3zHYg0r6HLAZuC4VrQf2jojfSjoQuEnSAWQ9r7YKqUmsc4A5AF1dXZ6sycxsiJSZVvW/A2eT3TJaBhwM3AUcPpgDSpoJvB84ojb7XkQ8R5orIyKWSloDvI7sSiJ/q2oisG4wxzUzs8Er08B9Ntnw5L+MiMOAtwCDuscj6Sjgs8BxEfFsrnycpDFpeV+yq5hHImI9sEnSwakX1KmUaC8xM7OhVeYJ7j9ExB8kIellEfGQpNcXbSRpHjAd2ENSL3A+We+nlwGLUg/Yxann06HAlyRtBrYAn4qIWuP4p8l6Vu1E1saRb+cwM7MWKJMsetPzEDeR/cg/SYlbQRFxcp3iqxrUbTicSER0A28oEaeZmVWkTAP3CWnxAkk/B14F/LTSqMzMrKM0Gxtq9zrF
K9L7zvR7hsLMzEauZlcWS8m6qea7r9Y+B7BvhXGZmVkHaTaQ4D6tDMTMzDpXmYEEJemjkv4+fd5b0kHVh2ZmZp2izHMW3wTeDnwkfd4EfKOyiMzMrOOU6Tr7toiYJuk+gIh4UtKOFcdlZmYdpMyVxZ/S09UB2dPWwPOVRmVmZh2lTLK4HJgP7CnpIuBO4H9XGpWZmXWUMg/lXSdpKXAEWbfZ4yPiwcojMzOzjtE0WUjaDlgeEW8AHmpNSGZm1mma3oaKiOeB+yXt3aJ4zMysA5XpDTUeWCVpCfC7WmFEHFdZVGZm1lHKJIsvVh6FmZl1tDIN3Le3IhAzM+tcZbrODpqkuZI2SlqZK9td0iJJq9P72FQuSZdL6pG0XNK03DYzU/3VaVpWMzNroUqTBdkMd0f1K5sN3BYRU4Db0meAo8mmU50CzAKugBeGSj8feBtwEHB+LcGYmVlrNEwWkm5L718e7M4j4g62nvdiBnB1Wr4aOD5Xfk1kFgO7SRoPvA9YFBFPRMSTwCK2TkBmZlahZm0W4yW9GzhO0vW8dF4LIuLeQR5zr4hYn/axXtKeqXwC8FiuXm8qa1S+FUmzyK5K2Htv9/Y1MxsqzZLFF8huEU0ELuu3LoDDhzgW1SnrP/lSvnzrwog5wByArq6uunXMzGzgmk1+9H3g+5L+PiIuHMJjbpA0Pl1VjAc2pvJeYFKu3kRgXSqf3q/8F0MYj5mZFShs4I6ICyUdJ+mr6fX+bTzmAqDWo2kmcHOu/NTUK+pg4Kl0u+oW4EhJY1PD9pGpzMzMWqTwOQtJF5P1QrouFZ0t6ZCIOLfEtvPIrgr2kNRL1qvpEuAGSacDvwI+mKovBI4BeoBngdMAIuIJSRcC96R6X4qI/o3mZmZWoTJPcB8LTE3jRCHpauA+oDBZRMTJDVYdUaduAGc02M9cYG6JWM3MrAJln7PYLbf8qioCMTOzzlXmyuJi4D5JPyfrmXQoJa4qzMxs5CgzNtQ8Sb8A3kqWLD4bEb+uOjAzM+scZa4sSL2SFlQci5mZdaiqx4YyM7MRwMnCzMwKNU0WkrbLDy9uZmajk+fgNjOzQp6D28zMCnkObjMzK1RqDm5JrwWmRMTPJL0CGFN9aGZm1inKDCT4CbIJhXYH9iObeOhb1BnfyWw4mTz7x+0OwWzYKNN19gzgEOBpgIhYDezZdAszMxtRyiSL5yLij7UPkranwUx1ZmY2MpVJFrdLOg/YSdJ7ge8BP6w2LDMz6yRlksVsoA9YAXySbJKiz1cZlJmZdZYyvaGeTxMe3U12++nhNFHRoEh6PfDdXNG+wBfI5sz4BFliAjgvIhambc4FTge2AGdFhKdVNTNroTK9oY4l6/20hmyI8n0kfTIifjKYA0bEw8DUtO8xwOPAfLJpVL8WEV/td/z9gZOAA4DXAD+T9LqI2DKY45uZ2cCVeSjvUuCwiOgBkLQf8GNgUMminyOANRHxS0mN6swAro+I54BHJfWQzQl+1xAc38zMSijTZrGxliiSR4CNQ3T8k4B5uc9nSlouaa6ksalsAvBYrk5vKtuKpFmSuiV19/X11atiZmaD0DBZSDpR0olk40ItlPQxSTPJekLds60HlrQjcBxZ7yqAK8ge+psKrCe7ooHs1ld/ddtMImJORHRFRNe4ceO2NUQzM0ua3Yb6b7nlDcC703IfMHbr6gN2NHBvRGwAqL0DSLoS+FH62AtMym03EVg3BMc3M7OSGiaLiDit4mOfTO4WlKTxafpWgBOA2jwaC4DvSLqMrIF7CrCk4tjMbAi1a2iVtZcc25bjjkRlekPtA/xPYHK+/rYMUZ4GI3wv2XMbNf8oaSrZLaa1tXURsUrSDcADwGbgDPeEMjNrrTK9oW4CriJrq3h+KA4aEc8Cr+5XdkqT+hcBFw3Fsc3MbODKJIs/RMTllUdiZmYdq0yy+GdJ5wO3As/VCiPi3sqiMrMh5yHZbVuUSRZvBE4BDufF21CRPpuZ2ShQJlmcAOyb
H6bczMxGlzJPcN9PNsifmZmNUmWuLPYCHpJ0Dy9tsxh011kzMxteyiSL8yuPwszMOlqZ+Sxub0UgZmbWuco8wb2JFwfu2xHYAfhdROxaZWBmZtY5ylxZ7JL/LOl4svkkzMxslCjTG+olIuIm/IyFmdmoUuY21Im5j9sBXTSYT8LMzEamMr2h8vNabCYbEXZGJdGYmVlHKtNmUfW8FmZm1uEaJgtJX2iyXUTEhRXEY2ZmHahZA/fv6rwATgc+u60HlrRW0gpJyyR1p7LdJS2StDq9j03lknS5pB5JyyVN29bjm5lZeQ2TRURcWnsBc4CdgNOA64F9h+j4h0XE1IjoSp9nA7dFxBTgtvQZsvm6p6TXLOCKITq+mZmV0LTrbPqf/j8Ay8luWU2LiM9GxMaK4pkBXJ2WrwaOz5VfE5nFwG6SxlcUg5mZ9dMwWUj6CnAPsAl4Y0RcEBFPDuGxA7hV0lJJs1LZXhGxHiC975nKJwCP5bbtTWX9Y54lqVtSd19f3xCGamY2ujXrDfUZslFmPw98TlKtXGQN3Ns63MchEbFO0p7AIkkPNamrOmVbPesREXPIbpnR1dXlZ0HMRrl2zg649pJj23bsKjRMFhEx4Ke7ByIi1qX3jZLmkw0hskHS+IhYn24z1W539QKTcptPBNZVGZ+Zmb2o0oTQiKRXStqltgwcCawEFgAzU7WZwM1peQFwauoVdTDwVO12lZmZVa/ME9xV2AuYn25tbQ98JyJ+miZYukHS6cCvgA+m+guBY4Ae4FmyXllmZtYibUkWEfEI8OY65b8FjqhTHsAZLQjNzMzqaMttKDMzG16cLMzMrJCThZmZFXKyMDOzQk4WZmZWyMnCzMwKOVmYmVkhJwszMyvkZGFmZoXaNdyHdZh2jc450kbmNBupfGVhZmaFnCzMzKyQk4WZmRVysjAzs0JOFmZmVqjlyULSJEk/l/SgpFWSzk7lF0h6XNKy9Domt825knokPSzpfa2O2cxstGtH19nNwGci4t40tepSSYvSuq9FxFfzlSXtD5wEHAC8BviZpNdFxJaWRm1mNoq1PFmkubPXp+VNkh4EJjTZZAZwfUQ8BzwqqQc4CLir8mCtcu16vsPMBqatbRaSJgNvAe5ORWdKWi5prqSxqWwC8Fhus14aJBdJsyR1S+ru6+urKGozs9GnbclC0s7AjcA5EfE0cAWwHzCV7Mrj0lrVOptHvX1GxJyI6IqIrnHjxlUQtZnZ6NSWZCFpB7JEcV1E/AAgIjZExJaIeB64kuxWE2RXEpNym08E1rUyXjOz0a4dvaEEXAU8GBGX5crH56qdAKxMywuAkyS9TNI+wBRgSaviNTOz9vSGOgQ4BVghaVkqOw84WdJUsltMa4FPAkTEKkk3AA+Q9aQ6wz2hzMxaqx29oe6kfjvEwibbXARcVFlQZmbWlJ/gNjOzQk4WZmZWyMnCzMwKOVmYmVkhJwszMyvkZGFmZoWcLMzMrFA7HsozMxvx2jWi8tpLjq1kv76yMDOzQk4WZmZWyLehOognAjKzTuUrCzMzK+RkYWZmhZwszMyskJOFmZkVcrIwM7NCwyZZSDpK0sOSeiTNbnc8ZmajybBIFpLGAN8Ajgb2J5uCdf/2RmVmNnoMl+csDgJ6IuIRAEnXAzPI5uUecn7ewczspYZLspgAPJb73Au8rX8lSbOAWenjM5IebkFsNXsAv2nh8QZrOMQ5HGKE4RHncIgRhkecwyFG9OVtivO1jVYMl2ShOmWxVUHEHGBO9eFsTVJ3RHS149gDMRziHA4xwvCIczjECMMjzuEQI1QX57BosyC7kpiU+zwRWNemWMzMRp3hkizuAaZI2kfSjsBJwII2x2RmNmoMi9tQEbFZ0pnALcAYYG5ErGpzWP215fbXIAyHOIdDjDA84hwOMcLwiHM4xAgVxamIrW79m5mZvcRwuQ1lZmZt5GRhZmaFnCwGQNIkST+X9KCkVZLOrlNnuqSnJC1Lry+0Kda1klak
GLrrrJeky9PwKcslTWtxfK/PnaNlkp6WdE6/Om05l5LmStooaWWubHdJiyStTu9jG2w7M9VZLWlmi2P8iqSH0t/nfEm7Ndi26XejBXFeIOnx3N/rMQ22bckQPw1i/G4uvrWSljXYtpXnsu7vT8u+mxHhV8kXMB6YlpZ3Af4L2L9fnenAjzog1rXAHk3WHwP8hOwZloOBu9sY6xjg18BrO+FcAocC04CVubJ/BGan5dnAl+tstzvwSHofm5bHtjDGI4Ht0/KX68VY5rvRgjgvAP62xHdiDbAvsCNwf/9/a1XG2G/9pcAXOuBc1v39adV301cWAxAR6yPi3rS8CXiQ7Ony4WgGcE1kFgO7SRrfpliOANZExC/bdPyXiIg7gCf6Fc8Ark7LVwPH19n0fcCiiHgiIp4EFgFHtSrGiLg1Ijanj4vJnkdqqwbnsowXhviJiD8CtSF+hlyzGCUJ+BAwr4pjD0ST35+WfDedLAZJ0mTgLcDddVa/XdL9kn4i6YCWBvaiAG6VtDQNg9JfvSFU2pX4TqLxP8ZOOJcAe0XEesj+0QJ71qnTSef042RXjvUUfTda4cx0u2xug9smnXIu3wVsiIjVDda35Vz2+/1pyXfTyWIQJO0M3AicExFP91t9L9ntlDcD/wLc1Or4kkMiYhrZSL1nSDq03/pSQ6hULT1keRzwvTqrO+VcltUp5/RzwGbgugZVir4bVbsC2A+YCqwnu83TX0ecS+Bkml9VtPxcFvz+NNysTtmAzqeTxQBJ2oHsL+q6iPhB//UR8XREPJOWFwI7SNqjxWESEevS+0ZgPtllfV6nDKFyNHBvRGzov6JTzmWyoXabLr1vrFOn7ec0NVy+H/jrSDer+yvx3ahURGyIiC0R8TxwZYPjd8K53B44EfhuozqtPpcNfn9a8t10shiAdP/yKuDBiLisQZ0/S/WQdBDZOf5t66IESa+UtEttmazhc2W/aguAU1OvqIOBp2qXsi3W8H9unXAucxYAtR4kM4Gb69S5BThS0th0a+XIVNYSko4CPgscFxHPNqhT5rtRqX5tYyc0OH4nDPHzHuChiOitt7LV57LJ709rvputaMUfKS/gnWSXbsuBZel1DPAp4FOpzpnAKrLeG4uBd7Qhzn3T8e9PsXwulefjFNmEUmuAFUBXG+J8BdmP/6tyZW0/l2TJaz3wJ7L/kZ0OvBq4DVid3ndPdbuAf8tt+3GgJ71Oa3GMPWT3pWvfzW+luq8BFjb7brQ4zmvTd2452Q/d+P5xps/HkPX4WVNlnPViTOXfrn0Xc3XbeS4b/f605Lvp4T7MzKyQb0OZmVkhJwszMyvkZGFmZoWcLMzMrJCThZmZFXKysGFJ0ha9dNTayQPYdrqkHw1RHMcNxYiokn6RRlhdrmzk2K8rN2qspP9XsP152xqDWTPuOmvDkqRnImLnJuu3jxcH1eu/bjrZqKfvryq+gZL0C7KYutNDaBeTPfvy7pLbNz0fZtvKVxY2Ykj6mKTvSfoh2eBuUjbHw8o058CHc9V3VTbnwwOSviVpu7SPIyXdJenetK+dU/laSV9M5Ssk/UXumF9Py9+W9IFcPM+k9/GS7khXQCslvavZnyOyUVb/Dthb0puL9iXpEmCnVHZdqndTGtxuVX6AO0nPSLpI2eCMiyXtlcr3Sufj/vR6Ryr/qKQlad//KmnMNvwV2TDmZGHDVe3HcZmk+bnytwMzI+JwsnF9pgJvJhu64Su5oSYOAj4DvJFsULsTlY079XngPZENDtcN/E1u379J5VcAfzuAWD8C3BIRtVjqTqSTFxFbyJ4M/ouifUXEbOD3ETE1Iv461ft4RBxI9hTvWZJencpfCSyObHDGO4BPpPLLgdtT+TRglaS/BD5MNljeVGALUNu/jTLbtzsAs0H6ffoB629RRNTmJngnMC/98G6QdDvwVuBpYElEPAIgaV6q+weyyWT+Mw1JtSNwV27ftYHblpIlorLuAeYqGwTupogoTBZJvZFC
y+7rLEknpOVJwBSyoVX+CNTaa5YC703LhwOnwguJ6ilJpwAHAvek87ET9Qeps1HAVxY20vwut1zvx7amf2NdpPqL0v/Qp0bE/hFxeq7Oc+l9C/X/o7WZ9G8qDfq2I7wwuc6hwOPAtZJOLfpDpNs9bySb4ObFIEvsK7XJvAd4e7pSuA94eVr9p3ixobLRn+OFXQFX587H6yPigqLYbWRysrCR7A7gw5LGSBpH9iO7JK07KI1ouh3ZrZY7yQYrPETSnwNIeoWk1w3geGvJ/icO2exlO6T9vBbYGBFXko0a2nS+83TVcDHwWEQs77eu0b7+lLYDeBXwZEQ8m9pWDi4R+23Ap9MxxkjaNZV9QNKeqXz3dHwbhZwsbCSbTzZC5/3A/wH+LiJ+ndbdBVxCNqT0o8D8iOgDPgbMk7ScLHn0bzNo5krg3ZKWAG/jxauc6cAySfcBfwX8c4Ptr0vHXUnWtlBvGtFG+5oDLE8N3D8Ftk/7ujD9OYqcDRwmaQXZ7akDIuIBsjacW9O+FpHNA22jkLvOmplZIV9ZmJlZIScLMzMr5GRhZmaFnCzMzKyQk4WZmRVysjAzs0JOFmZmVuj/A0Kq1zSCggDyAAAAAElFTkSuQmCC\n",
      "text/plain": [
       "<Figure size 432x288 with 1 Axes>"
      ]
     },
     "metadata": {
      "needs_background": "light"
     },
     "output_type": "display_data"
    }
   ],
   "source": [
    "similarity=similarity.flatten()\n",
    "print(similarity.shape)\n",
    "\n",
    "# cleanup self-compare and dup-compare\n",
    "s = similarity > 0\n",
    "s = np.unique(similarity[s])\n",
    "plt.xlabel('Frobenius Distance')\n",
    "plt.ylabel('Number of relation pairs')\n",
    "plt.hist(s)\n",
    "plt.show()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.7.6"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
